Incident Report: Frequent Dead RPC URL Alerts for Kava's Public RPC

Date: 2024-01-02
Time: 12:34 (GMT+3)
Duration: 23 minutes

Description

Frequent Dead RPC URL alerts were reported for Kava's public RPC. Over a half-day period, the alerts repeatedly opened and then auto-closed within 15 minutes, and they also reached the maximum count limit while open.
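The open/auto-close pattern described above is often called flapping. As a rough sketch only (not the monitoring system's actual logic), the following Python function flags flapping when too many alerts open inside a sliding time window. The function name `is_flapping`, the 12-hour window, and the threshold of 3 opens are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def is_flapping(open_times, window=timedelta(hours=12), max_opens=3):
    """Return True if more than `max_opens` alerts opened within any `window`.

    `open_times` is an iterable of datetimes at which an alert opened.
    The window length and threshold are illustrative assumptions.
    """
    times = sorted(open_times)
    start = 0
    for end, current in enumerate(times):
        # Slide the left edge forward until the window fits again.
        while current - times[start] > window:
            start += 1
        if end - start + 1 > max_opens:
            return True
    return False


# Hypothetical timestamps for demonstration, not taken from the incident.
base = datetime(2024, 1, 2, 0, 30)
example_opens = [base + timedelta(minutes=30 * i) for i in range(10)]
flapping = is_flapping(example_opens)
```

A check like this could be used to collapse many short-lived alerts into a single "endpoint is unstable" notification, which is easier to act on than a stream of open/close events.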

Root Cause

The root cause was identified as Kava's public RPC URL itself: the endpoint was intermittently non-functional over the reported period, which triggered the alerts.

Impact

The frequent alerts could have caused monitoring fatigue or confusion. There was no immediate impact on operations, because Reblok, the primary RPC, remained functional and the public RPC serves only as a backup/fallback.

Timeline

  • 12:34 - Andrew reported the issue with frequent Dead RPC URL alerts.
  • 12:57 - Aaron acknowledged the problem and suggested considering a swap for a more reliable public RPC.

Lessons Learned

Even backup or fallback systems can cause alert fatigue and need monitoring for reliability. Regular checks and updates to these systems can prevent unnecessary alerts and ensure they are ready to function effectively when needed.
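One common way to reduce alert noise from an intermittently failing endpoint is to debounce the alert state: open only after several consecutive failed probes, and close only after several consecutive successful ones. The sketch below is a minimal illustration under assumed thresholds; the class name `DebouncedAlert` and its parameters are hypothetical and do not describe the alerting setup used in this incident. Probe results are passed in as booleans, so the actual RPC check is left out.

```python
from dataclasses import dataclass


@dataclass
class DebouncedAlert:
    """Alert state with hysteresis (thresholds are illustrative assumptions)."""

    fail_threshold: int = 3  # consecutive failures needed to open the alert
    ok_threshold: int = 5    # consecutive successes needed to close it
    is_open: bool = False
    _fails: int = 0
    _oks: int = 0

    def observe(self, probe_ok: bool) -> bool:
        """Record one probe result and return whether the alert is open."""
        if probe_ok:
            self._oks += 1
            self._fails = 0
            if self.is_open and self._oks >= self.ok_threshold:
                self.is_open = False
        else:
            self._fails += 1
            self._oks = 0
            if not self.is_open and self._fails >= self.fail_threshold:
                self.is_open = True
        return self.is_open
```

The trade-off is that debouncing can hide an endpoint that fails often but never several times in a row, so it is usually paired with a separate reliability metric (for example, a failure rate over a longer period) that informs decisions such as swapping the fallback RPC.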

Actions Taken

  1. Monitoring and reporting of the frequent Dead RPC URL alerts for Kava's public RPC.
  2. Confirmation that primary operations were not affected, because Reblok remained functional.
  3. Discussion and consideration of swapping out the unreliable public RPC for a more stable one.

Incident Reviewer(s)

  • Andrew Prasaath (Reported and followed up on the issue)
  • Bedirhan (Provided initial assessment)
  • Aaron (Acknowledged and suggested a potential solution)